7.1 Computational Histories

We conclude the discussion of undecidability with a few more examples of languages that one might not expect to be undecidable.

Definition

The configuration of a Turing Machine at some point of time consists of the the content of the tape, the position of the head on that tape, and the state of the DFA. > We can represent any configuration of a TM using a string \(wqv\) where > >- \(w\) is the string of symbols on the tape to the left of the head; >- \(q\) is the DFA controller’s current state; >- \(v\) is the string of symbols at and to the right of the head. > The infinite blank part of the tape to the left of \(w\) and the right of \(v\) is left implicit.

Even though the tape is infinite, at any instant during a computation there are only finitely many non-blank symbols on the tape. The TM starts with a tape that is blank except for the finite input string, and after finitely many steps the TM can have written to only finitely many squares on the tape.

Definition

A Computation History is a step-by-step (finite) trace of a Turing Machine’s computation, represented as a string. > For example if a TM takes threes step through the successive configurations >\[ >q_011\ \Rightarrow\ 1q_01\Rightarrow\ 11q_0\ > \Rightarrow\ 111q_a >\] >then we can represent this history as a finite string: >\[ >\#q_011\#1q_01\#11q_0\#111q_a\# >\] > Note: If the \(\#\) character might have been written to the tape as part of the input or computation, we would use a different separator character.

Definition

A history checker for \(M\) accepting \(w\) is a Turing Machine that takes an input string \(h\) checks whether \(h\) is a valid computation history showing \(M\) accepting \(w.\) > Such a machine always exists for any \(M\) and any \(w\); it just needs to check whether > >- the input \(h\) consists of configurations separated by \(\#\); >- the first configuration has the TM’s head in its start state positioned at the beginning of the string \(w\); >- successive configurations differ only around the head, in exactly the ways corresponding to the \(M\)’s transitions; and >- the last configuration has the TM in an accepting state.

Note that a history checker for \(M\) accepting \(w\) always exists whether or not \(M\) accepts \(w\)! If \(M\) accepts \(w\) then there is a history showing the trace of that computation, and this string will be accepted by the history checker. But if \(M\) does not accept \(w\), there is no such finite trace. As a consequence, any purported trace we give to the history checker will be found to be flawed, and the history checker won’t accept any strings at all.

This observation immediately gives us a different way to prove a known fact:

Example

Here’s another proof that \(E_\mathrm{TM} = \{\ \ \langle M\rangle\ |\ L(M)=\emptyset\ \}\) is not semidecidable. > Proof. We will show that \(\lnot A_\mathrm{TM} \leq_m E_\mathrm{TM}\). > Let \(f\) be the function that turns \(\langle M,w\rangle\) into the code of a history checker for \(M\) accepting \(w\). > >- If \(\langle M,w\rangle \in \lnot A_\mathrm{TM}\) then \(M\) does not accept \(w\), so there no computation history that shows how \(M\) accepts \(w\), so there are no strings accepted by the corresponding history checker and its language is empty. >- If \(\langle M,w\rangle \not\in \lnot A_\mathrm{TM}\) then there is a computation history showing how \(M\) accepts \(w\), so the corresponding history checker accepts that string and its language is not empty.

Undecidability and CFLs

The idea of computation histories lets us prove surprising facts about context-free languages.

Example

While it is easy to decide whether a context-free grammar generates any strings at all (whether its language is nonempty), it is undecidable whether a context-free grammar can generate all possible strings! > That is, the language >\[ >\mathit{ALL}_\mathrm{CFG}\ = \{ \langle G\rangle\ |\ L(G)=\Sigma^*\ \} >\] is not decidable. > Proof Sketch. One can show that \(\lnot A_\mathrm{TM} \leq_m \mathit{ALL}_\mathrm{CFG}\). > The fundamental trick is that given \(\langle M,w\rangle\), the strings that are not valid computation histories are (with one small hack) context-free. > So asking whether \(M\) fails to accept \(w\) is equivalent to asking whether all strings are not valid histories showing \(M\) accept \(w\), which in turn is equivalent to asking whether the language of CFG generating strings that are not valid histories includes all possible strings.

To see why strings that are not valid computation histories are context-free, it is a little easier to think about PDAs rather than grammars. A trace is not a valid history of \(M\) accepting \(w\) if it has at least one error, such as:

The first configuration does not have the TM in its start state, or \(w\) is not on the tape;
The final configuration does not have the TM in its accepting state; or
There are two consecutive configurations that don’t match up according to the transitions of the TM.

We can find each sort of error using a PDA, and then can create one “super” PDA that nondeterministically chooses which kind of error to look for (and in the case of consecutive configurations, nondeterministically choose where in the string to find those configurations). If an error exists, the nondeterministic PDA is guaranteed to find it. (And if a PDA exists, then there is a corresponding CFG.)

The small hack mentioned above: when comparing consecutive configurations, PDAs are not very good at determining whether two arbitrary strings (tape contents) are the same or almost the same. But their stack makes them good at palindromes, checking whether one string is the reverse (or almost the reverse) of another. So for this problem we use a slightly modified definition of computation history in which every other configuration in this history is reversed, i.e., instead of \[ \#C_0\#C_1\#C_2\#C_3\#\cdots \] we have \[ \#C_0\#C_1^R\#C_2\#C_3^R\#\cdots. \] In this way, even a PDA can check that \(C_i\) and \(C_{i + 1}\) match up appropriately for a particular TM \(M\).

Example

It is not even semidecidable whether two CFGs parse the same strings (i.e., produce the same language). That is, >\[ >\mathit{EQ}_\mathrm{CFG} = \{\ \langle G_1,G_2\rangle\ |\ L(G_1)=L(G_2)\ \} >\] is not decidable and not semidecidable. > >Proof. We can show \(\mathit{ALL}_\mathrm{CFG} \leq_m \mathit{EQ}_\mathrm{CFG}\) using a mapping that sends \(\langle G\rangle\) to the pair \(G, G_\mathit{all}\) where \(G_\mathit{all}\) is a context-free grammar specifically constructed to generate all strings in \(\Sigma^*)\).

Previous: Rice’s Theorem

Next: Post Correspondence Problem